Skip to content

Conversation

@the-mikedavis
Copy link
Collaborator

@the-mikedavis the-mikedavis commented Aug 1, 2024

rabbit_khepri:setup/0 is always called during boot for the sake of clustering. Previously we registered projections during rabbit_khepri:setup/0 meaning that even if you run Rabbit 3.13.x without enabling the khepri_db feature flag you will attempt to register projections. It's a benign bug - it doesn't cause any bad behavior other than a potential timeout on boot if a server is in a minority - but it modifies the Khepri store so we need to consider the projections when we reorganize the tree (#11225).

Instead of registering projections during rabbit_khepri:setup/0 we should register them in two spots:

  • In init (rabbit_db:init/0), a boot step run after rabbit_khepri:setup/0 which has a Khepri-specific branch, and only if we detect that the database is blank. (We could always register on init when Khepri is enabled but there's no need to unless the database is blank.)
  • In the enable callback for the khepri_db feature flag.

In the future when we want to unregister projections, we can add that step before rabbit_khepri:register_projections/0 to the enable callback for the khepri_db feature flag (using rabbitmq/khepri#282).

@the-mikedavis the-mikedavis changed the title rabbit_khepri: Avoid throws in register_projection/0 Move projection registration out of rabbit_khepri:setup/0 Aug 1, 2024
@the-mikedavis the-mikedavis force-pushed the md/khepri/register-projections-khepri-only branch from 5225381 to 90e0fd0 Compare August 3, 2024 15:11
@the-mikedavis the-mikedavis changed the title Move projection registration out of rabbit_khepri:setup/0 Move projection registration to enable callback of khepri_db feature flag Aug 3, 2024
@the-mikedavis the-mikedavis force-pushed the md/khepri/register-projections-khepri-only branch from 86f2651 to 348c6d4 Compare August 6, 2024 15:45
@the-mikedavis the-mikedavis changed the title Move projection registration to enable callback of khepri_db feature flag Move projection registration out of Khepri setup Aug 6, 2024
@the-mikedavis the-mikedavis requested a review from dumbbell August 6, 2024 19:04
@the-mikedavis the-mikedavis marked this pull request as ready for review August 6, 2024 19:04
@the-mikedavis the-mikedavis self-assigned this Aug 6, 2024
@dumbbell
Copy link
Collaborator

In the future when we want to unregister projections, we can add that step before rabbit_khepri:register_projections/0 to the enable callback for the khepri_db feature flag (using rabbitmq/khepri#282).

Do you want to wait for the unregister_projections() API to use it in this pull request? I’m not sure I get all the ins and outs of the upgrade problem.

@the-mikedavis
Copy link
Collaborator Author

Yeah let's wait for that since we also want the fence API to await a leader (as mentioned above)

@dumbbell
Copy link
Collaborator

dumbbell commented Sep 5, 2024

Khepri 0.15.0 is now released and used in RabbitMQ main branch.

@the-mikedavis the-mikedavis force-pushed the md/khepri/register-projections-khepri-only branch 2 times, most recently from 2b247e3 to 56f4489 Compare September 5, 2024 16:08
@the-mikedavis the-mikedavis added this to the 3.13.8 milestone Sep 5, 2024
@the-mikedavis the-mikedavis marked this pull request as ready for review September 5, 2024 22:28
Copy link
Collaborator

@dumbbell dumbbell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The patch looks good to me. I just made a minor style-related comment.

Previously this function threw errors. With this minor refactor we
return them instead so that `register_projection/0` is easier for
callers to work with. (In the child commit we will add another caller.)
This is a cosmetic change. `?RA_CLUSTER_NAME` is equivalent but is used
for clustering commands. Commands sent via the `khepri`/`khepri_adv`
APIs consistently use the `?STORE_ID` macro instead.
This causes a clearer error when the `enable_feature_flags/2` function
returns something not in the shape `ok | {skip, any()}`.
This covers a specific case where we need to register projections not
covered by the enable callback of the `khepri_db` feature flag. The
feature flag may be enabled if a node has been part of a cluster which
enabled the flag, but the metadata store might be reset. Upon init the
feature flag will be enabled but the store will be empty and the
projections will not exist, so operations like inserting default data
will fail when asserting that a vhost exists for example.

This fixes the `cluster_management_SUITE:forget_cluster_node_in_khepri/1`
case when running the suite with `RABBITMQ_METADATA_STORE=khepri`, which
fails as mentioned above.

We could run projection registration always when using Khepri but once
projections are registered the command is idempotent so there's no need
to, and the commands are somewhat large.
`khepri:fence/0,1,2` queries the leader's Raft index and blocks the
caller for the given (or default) timeout until the local member has
caught up in log replication to that index. We want to do this during
Khepri init to ensure that the local Khepri store is reasonably up to
date before continuing in the boot process and starting listeners.

This is conceptually similar to the call to `mnesia:wait_for_tables/2`
during `rabbit_mnesia:init/0` and should have the same effect.
@the-mikedavis the-mikedavis force-pushed the md/khepri/register-projections-khepri-only branch from 6870cf1 to ce72903 Compare September 9, 2024 13:06
Copy link
Collaborator

@dumbbell dumbbell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

@the-mikedavis the-mikedavis merged commit 4a85433 into main Sep 9, 2024
@the-mikedavis the-mikedavis deleted the md/khepri/register-projections-khepri-only branch September 9, 2024 14:29
@the-mikedavis
Copy link
Collaborator Author

I'll set up the backports once we have the prerequisite PRs backported. In particular this PR needs #11672

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Development

Successfully merging this pull request may close these issues.

3 participants